 zero-shot cost model


Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction

arXiv.org Artificial Intelligence

In this paper, we introduce zero-shot cost models, which enable learned cost estimation that generalizes to unseen databases. In contrast to state-of-the-art workload-driven approaches, which require executing a large set of training queries on every new database, zero-shot cost models allow a learned cost model to be instantiated out of the box without expensive training data collection. To enable such zero-shot cost models, we suggest a new learning paradigm based on pre-trained cost models. As core contributions to support the transfer of such a pre-trained cost model to unseen databases, we introduce a new model architecture and representation technique for encoding query workloads as input to those models. As we will show in our evaluation, zero-shot cost estimation can provide more accurate cost estimates than state-of-the-art models for a wide range of (real-world) databases without requiring any query executions on unseen databases. Furthermore, we show that zero-shot cost models can be used in a few-shot mode that further improves their quality by retraining them with just a small number of additional training queries on the unseen database.


One Model to Rule them All: Towards Zero-Shot Learning for Databases

arXiv.org Artificial Intelligence

In this paper, we present our vision of so-called zero-shot learning for databases, which is a new learning approach for database components. Zero-shot learning for databases is inspired by recent advances in transfer learning of models such as GPT-3 and can support a new database out of the box without the need to train a new model. As a first concrete contribution in this paper, we show the feasibility of zero-shot learning for the task of physical cost estimation and present very promising initial results. Moreover, as a second contribution we discuss the core challenges related to zero-shot learning for databases and present a roadmap to extend zero-shot learning towards many other tasks beyond cost estimation or even beyond classical database systems and workloads.

And unfortunately, the training data collection needs to be repeated for every new database that needs to be supported. To reduce the high cost of training data collection, reinforcement learning (RL) has been used to execute training queries [10, 17, 18, 34] in a more targeted manner (i.e., letting the RL agent decide which queries to execute next). However, even with reinforcement learning, a large number of training queries still needs to be executed for learning a model. Moreover, training the model is not a one-time effort since, similar to workload-driven approaches, the learning procedure needs to be repeated for every new database at hand. A different direction that has thus been proposed to avoid the expensive training data collection by running queries on a new database are so-called data-driven approaches [11, 31, 32] that learn